Behold the Manifold, the Concept that Changed How Mathematicians View Space

WIRED

In the mid-19th century, Bernhard Riemann conceived of a new way to think about mathematical spaces, providing the foundation for modern geometry and physics. Standing in the middle of a field, we can easily forget that we live on a round planet. We're so small in comparison to the Earth that from our point of view, it looks flat. The world is full of such shapes--ones that look flat to an ant living on them, even though they might have a more complicated global structure. Mathematicians call these shapes manifolds.



Gini Score under Ties and Case Weights

Brauer, Alexej, Wüthrich, Mario V.

arXiv.org Machine Learning

The Gini score is a popular statistical tool in model validation. It was originally introduced for binary responses Y ∈ {0, 1}, and there are many equivalent formulations of the (binary) Gini score, such as the receiver operating characteristic (ROC) curve and the area under the curve (AUC); see, e.g., [Bamber (1975)], [Hanley-McNeil (1982)] and [Fawcett (2006)]. These formulations are also equivalent to the Wilcoxon-Mann-Whitney U statistic, see [Hanley-McNeil (1982)], [DeLong et al. (1988)], [Byrne (2016)], and to Somers' D [Somers (1962)], see [Newson (2002)]. Thus, there are at least five equivalent formulations of the Gini score in the binary context, and there is a broad literature on its behavior, which is well understood. For general real-valued responses, things become more difficult, and definitions and results on the Gini score are mainly found in the credit risk and actuarial literature, where it was introduced by [Gourieroux-Jasiak (2007)] and [Frees et al. (2011), Frees et al. (2013)]; in the real-valued setting it is studied in detail in [Denuit et al. (2019)] and [Denuit-Trufin (2021)]. The Gini score is a statistic that assesses whether a given risk ranking is correct.
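The binary-case equivalences mentioned in the abstract can be illustrated with a small sketch (function names are ours): computing AUC via the Mann-Whitney U statistic, with the conventional half-count for tied scores, and then the binary Gini score as 2·AUC − 1.

```python
import numpy as np

def auc_mann_whitney(y_true, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen negative,
    with ties counting one half."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()   # strict wins
    ties = (pos[:, None] == neg[None, :]).sum()     # tied pairs
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([0, 0, 1, 1, 1, 0])
s = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])
auc = auc_mann_whitney(y, s)
gini = 2.0 * auc - 1.0   # binary Gini as a linear transform of AUC
```

Here AUC comes out to 8/9 and the Gini score to 7/9; the point is only that the two statistics carry the same ranking information, which is what the equivalence results above formalize.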


Geodesics in the Deep Linear Network

Chen, Alan

arXiv.org Artificial Intelligence

We derive a general system of ODEs and associated explicit solutions in a special case for geodesics between full rank matrices in the deep linear network geometry. In the process, we characterize all horizontal straight lines in the invariant balanced manifold that remain geodesics under Riemannian submersion.



Graph-based Integrated Gradients for Explaining Graph Neural Networks

Simpson, Lachlan, Millar, Kyle, Cheng, Adriel, Lim, Cheng-Chew, Chew, Hong Gunn

arXiv.org Artificial Intelligence

Integrated Gradients (IG) is a common explainability technique for addressing the black-box problem of neural networks. However, IG assumes continuous data, whereas graphs are discrete structures, making it ill-suited to graphs. In this work, we introduce graph-based integrated gradients (GB-IG), an extension of IG to graphs. We demonstrate on four synthetic datasets that GB-IG accurately identifies crucial structural components of the graph used in classification tasks. We further demonstrate on three prevalent real-world graph datasets that GB-IG outperforms IG in highlighting important features for node classification tasks.
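For context on the continuity assumption the abstract refers to, here is a minimal sketch of standard IG (not the paper's GB-IG variant): gradients are averaged along a straight-line path from a baseline to the input, which presumes the input space is continuous.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Standard IG: average the model gradient along the straight-line
    path from `baseline` to `x`, then scale by (x - baseline).
    `grad_fn(p)` returns the model's gradient at input p."""
    alphas = (np.arange(steps) + 0.5) / steps          # midpoint rule
    path = baseline + alphas[:, None] * (x - baseline)  # interpolated inputs
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    return (x - baseline) * avg_grad

# toy model f(x) = sum(x**2), whose gradient is 2x
grad_fn = lambda p: 2.0 * p
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
attr = integrated_gradients(grad_fn, x, baseline)
# completeness axiom: attributions sum to f(x) - f(baseline)
```

The interpolated points on the path are generally not valid graph structures, which is precisely the mismatch GB-IG is designed to address.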



Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases

Neural Information Processing Systems

Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment.


High-Dimensional Data Classification in Concentric Coordinates

Williams, Alice, Kovalerchuk, Boris

arXiv.org Artificial Intelligence

Alice Williams, Department of Computer Science, Central Washington University, USA (ORCID 0009-0001-6154-2407). Boris Kovalerchuk, Department of Computer Science, Central Washington University, USA (ORCID 0000-0002-0995-9539).

Abstract -- The visualization of multi-dimensional data with interpretable methods remains limited by the capabilities of high-dimensional lossless visualizations that do not suffer from occlusion and are computationally tractable through parameterized visualization. This paper proposes a framework supporting low- to high-dimensional data using lossless Concentric Coordinates, a more compact generalization of Parallel Coordinates, along with the earlier Circular Coordinates. These are forms of the General Line Coordinate visualizations that can directly support machine learning algorithm visualization and facilitate human interaction.

A. Motivation. In many domains, accurate and interpretable classification models can be visualized. In many other domains, however, this remains a long-standing and critical roadblock to deploying artificial intelligence and machine learning (AI/ML) models, which is especially challenging for high-risk tasks like healthcare diagnostics. Visualization of multidimensional (n-D) data classification is critical for three major reasons: (1) to speed up analysis of prediction accuracy, (2) to interpret/explain classifier predictions, and (3) to improve/modify the prediction model.

B. Overview of Existing Methods. AI/ML tasks for high-dimensional (n-D) data are commonly approached with black-box deep-learning (DL) methods that inherently lack interpretability and decision explanation, relying instead on post-hoc explainability methods such as LIME or SHAP [7]. Moreover, visualization methods commonly preprocess data with dimensionality reduction (DR) methods like Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), or other similar approximations. However, such methods are lossy and not reversible, and therefore commonly introduce inaccuracies in n-D that cannot be visually verified. Alternatively, lossless visualizations allow the use of Visual Knowledge Discovery (VKD) to visually discover algorithmic adjustments that improve ML prediction models [5].
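The lossless-mapping idea behind coordinate systems of this kind can be sketched in code. The paper's exact placement convention for Concentric Coordinates is not given in this excerpt, so the sketch below assumes a hypothetical scheme: the i-th normalized attribute is placed on the i-th concentric circle, with the value encoded as an angle, so that every coordinate of the n-D point remains recoverable from the plotted vertices.

```python
import math

def concentric_coordinates(point, r_step=1.0):
    """Hypothetical sketch of a lossless concentric layout: attribute i
    (normalized to [0, 1]) lands on the circle of radius (i + 1) * r_step,
    at angle 2*pi*value. The (radius, angle) pair recovers both the
    attribute index and its value, so no information is discarded."""
    vertices = []
    for i, v in enumerate(point):
        r = (i + 1) * r_step
        theta = 2.0 * math.pi * v
        vertices.append((r * math.cos(theta), r * math.sin(theta)))
    return vertices  # polyline vertices for one n-D point

verts = concentric_coordinates([0.25, 0.5, 1.0])
```

Unlike PCA or t-SNE, nothing here is projected away: the dimensionality of the display grows with n via additional circles rather than by collapsing attributes.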